Description

[0001] The present invention relates to a computer system, and in particular to a computer system with a segmented
bus according to the precharacterizing clause of the first claim.

[0002] The architecture of a computer system typically comprises a bus structure consisting of a plurality of transmission
lines to which various units are connected in parallel. In a computer system which includes a large number of
units, as in a multi-processor system for example, the physical length of the bus becomes rather large. A disadvantage
of this structure is that the length of the bus increases the signal propagation time; this reduces the
operating frequency of the bus, since the duration of an operating cycle is inevitably greater than this propagation time.
Furthermore, since the same data item is distributed simultaneously to all the units connected to the bus, the structure
is affected by the electrical load (input impedance) introduced by these units; this makes it necessary to use driver
circuits with relatively high power and consequently high consumption, and creates a non-uniform distribution of the
electrical load which may give rise to reflection phenomena. The bus therefore has a low transfer rate, which has a
marked effect on the performance of the whole computer system.

[0003] A further disadvantage is manifested in the case in which the bus (known as the remote bus) is used to connect
nodes which comprise different units interconnected by means of a further bus (called the local bus). 
[0004] The nodes are connected to the system or remote bus by a device which acts as a bridge between the local 
bus and the remote bus. 

[0005] Each node corresponds to a single load (that of the bridge) connected to the remote bus. 
[0006] In this way it is possible to reduce the number of loads connected to the bus and to improve its performance.
[0007] However, the remote bus generally has a greater length than the local buses, and therefore its operating 
speed is lower; this means that whenever a node accesses the remote bus it is necessary to introduce a latency period 
of a few operating cycles of the local bus. 

[0008] A particular type of computer system is described by PATENT ABSTRACTS OF JAPAN vol. 8, no. 223 (P307),
12 October 1984 & JP 59 106021 A (OKI DENKI KOGYO KK), 19 June 1984. This document discloses a computer
system wherein central processing units, main storage devices, and input/output devices are connected to a bus
connecting circuit through internal buses. Moreover, a bus use right determining circuit is provided which accepts the
bus use request signals and generates direction control signals to be fed to the connecting circuit.
[0009] The object of the present invention is to overcome the aforesaid disadvantages. To achieve this object, a
computer system as defined in the independent claim 1 is proposed.

[0010] Essentially, the system bus, or remote bus, is divided into a plurality of segments of reduced length, linked in 
series and interconnected by pairs of buffer registers which transfer data from one bus segment to those immediately 
adjacent, in one or other of the two possible directions (for this purpose, the interconnection between two segments 
is provided by pairs of buffers, one for the transfer of data in one direction, and the other for the transfer in the opposite 
direction).

[0011] Document EP-A-0446039 describes a computer system of the type indicated above, reflecting the content
of the pre-characterizing portion of claim 1. This document discloses a computer system employing a multilevel hierarchical
bus architecture provided with bus segments consisting of higher-level buses joined to units of the system
by means of respective lower-level buses. The buses may exchange signals temporarily stored in registers.
[0012] In accordance with the invention, the buffers are controlled by an arbitration unit, timed by a periodic clock
signal, to store the data present in one bus segment in one period of the clock signal, with the leading edge of the clock
signal which terminates the period and starts the next, and to transfer the data thus stored to the adjacent bus segment
with the same leading edge of the clock signal.

[0013] It is thus evident that at least N-1 periods of the clock signal are required to transfer a data item along N 
concatenated bus segments.

[0014] However, it is evident that up to N different data items may pass simultaneously through the different segments
of the bus in both directions, with a substantial increase in the transfer rate.
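
The timing relationship stated in the two preceding paragraphs can be illustrated with a small Python sketch. It is not part of the described embodiment: the function names are hypothetical, and the sketch assumes that a new data item enters the first segment in every clock period and that all items travel in one direction without contention or recovery cycles.

```python
def transfer_latency(n_segments):
    """Minimum number of clock periods for one item to cross n concatenated segments."""
    if n_segments < 1:
        raise ValueError("at least one segment is required")
    return n_segments - 1


def occupancy(n_segments, period):
    """Index of the item present on each segment in a given clock period.

    Assumption: item k enters segment 0 in period k and advances by one
    segment per clock period. None marks a segment that is still idle.
    """
    return [period - s if period - s >= 0 else None for s in range(n_segments)]
```

Under these assumptions, `transfer_latency(4)` gives 3 periods for a four-segment bus, and `occupancy(4, 3)` shows four different items occupying the four segments in the same period.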

[0015] The arbitration unit, using suitable arbitration algorithms, determines the order in which the different data
items are transferred from one segment to another in such a way as to provide the best possible transfer rate in different
circumstances.

[0016] These and further characteristics and advantages of the computer system according to the present invention 
are made clear by the following description of a preferred embodiment of the invention, supplied for guidance and 
without restriction, with reference to the attached figures, in which: 

Fig. 1 is a block diagram of the computer system according to the present invention;

Fig. 2 shows an example in the form of a time diagram of the data transfer in the computer system according to 
the present invention; 

Fig. 3 is a detailed block diagram of a preferred embodiment of the arbitration and bus control unit for the system
shown in Figure 1.

[0017] With particular reference to Fig. 1, a computer system 100 includes a multi-point bus structure 105 for the
transmission of a data item (a value, an address or a command) consisting of one or more binary digits (or "bits"), to
which various units U1-U4 are connected in parallel. The bus 105 is of the synchronous type, in which the time intervals
of occupation of the bus 105 have a predetermined duration determined by the period of a clock signal. The computer
system 100 is a multi-processor system, in which the bus 105 is a remote bus and in which one or more of the units
U1-U4 consists of an interface unit (bridge) for connection with a local bus (not shown in the figure); the local bus is
connected to one or more computing units or processors and other units (for example, a memory of the cache type, a
local memory, an input/output channel, and similar) which form a node (or cluster). However, the present invention is
also suitable for use in different structures, for example in a computer system without local buses, in a single-processor
system, and similar.

[0018] According to the present invention, the bus 105 of the computer system 100 has a segmented structure and
is divided into a plurality n (where n > 2) of segments B1-B4 (four in the present example); each bus segment B(i)
(where i = 1 ... n) is connected to a variable number (which at one extreme may be zero) of corresponding units U(i).
[0019] If more than one unit is connected to one bus segment, a local arbitration unit, not illustrated, ensures that 
only one unit can request access to the segment at any time. The different units connected to the same bus segment 
can therefore be considered as a single unit. 

[0020] The bus segments B1-B4 are concatenated by means of memory elements which operate as impedance
separators, or buffers, in such a way that they electrically decouple the adjacent bus portions B1-B4. Each bus segment
B(i) (with the exception of the first at the left-hand end of the series) is connected, at its left-hand end, to a rightward-passing
buffer (more briefly, a "right buffer") Br(i-1) which receives a data item from a preceding bus segment B(i-1),
stores it and subsequently transfers it to the bus segment B(i); it is also connected to a leftward-passing buffer (more
briefly, a "left buffer") Bl(i) which receives a data item from a following bus portion B(i+1), stores it and subsequently
transfers it to the bus portion B(i).

[0021] The other end of each bus segment (with the exception of the final right-hand segment of the series) is connected
to a rightward-passing buffer Br(i) which receives a data item from the segment, stores it and subsequently
transfers it to the bus segment B(i+1), and is also connected to a leftward-passing buffer Bl(i) which receives a data
item from the following segment B(i+1), stores it and subsequently transfers it to the segment B(i).
[0022] As stated previously, the two end segments of the series, each of which is connected to a buffer register at
one end only, constitute an exception.

[0023] The present invention is, however, also suitable for application to a ring bus, in which the first and last segments
of the bus are interconnected by means of a further left buffer and a further right buffer. Preferably, each right buffer
Br1-Br3 includes a bank of registers of the FIFO (First In First Out) type.
[0024] Each bank consists of two or more ordered registers. The various registers are controlled by means of a
suitable control circuit in such a way that the data are loaded into the first free register in the order in the bank or into
the one which is about to become free.
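
The loading rule for such a register bank can be illustrated with a minimal Python sketch. The class name `RegisterBank`, its `depth` parameter and its `clock` method are hypothetical and are not taken from the described circuit; the sketch only assumes that, on one clock edge, the oldest item may leave while a new item enters, so that a full bank can still accept data when a register is about to become free.

```python
from collections import deque


class RegisterBank:
    """Hypothetical model of a FIFO bank of ordered registers."""

    def __init__(self, depth=2):
        self.depth = depth
        self.items = deque()

    def can_load(self, unloading):
        # A load is possible if a register is free, or is about to become
        # free because the oldest item leaves on the same clock edge.
        return len(self.items) < self.depth or (unloading and len(self.items) > 0)

    def clock(self, load=None, unload=False):
        """Apply one clock edge and return the item unloaded, if any."""
        if load is not None and not self.can_load(unload):
            raise RuntimeError("bank full and no register about to become free")
        out = self.items.popleft() if unload and self.items else None
        if load is not None:
            self.items.append(load)
        return out
```

A bank of depth two that already holds two items refuses a plain load, but accepts one when the oldest item is unloaded on the same edge.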

[0025] The present invention is, however, also suitable for application to different memory structures, such as a single
register, and similar.

[0026] The computer system 100 also includes a central arbitration unit (ARB) 160 to control access, for each bus
segment B(i), to transfer a data item to this bus segment B(i) from the connected units U(i), the corresponding right
buffer Br(i-1) and the left buffer Bl(i) in a mutually exclusive way; the arbitration unit 160 also controls the loading of
the data item present in the bus segment B(i) into the left buffer Bl(i-1) and into the right buffer Br(i). The arbitration
unit 160 is connected to each unit U1-U4 and to each of the buffers Br1-Br3, Bl1-Bl3 to send a corresponding signal
enabling access to the bus segments.

[0027] Preferably, the connection to the units U2, U3 and to the buffers Br1-Br3, Bl1-Bl3 disposed near the arbitration
unit 160 is made directly by means of a dedicated line 161. However, the connection to the more remote units U1 and
U4, for which the propagation time of these signals and consequently their time dispersion is greater, is made by using
an intermediate register (buffer) 167, timed by a clock signal CLK; in particular, a dedicated line 168 is used to send
the enabling signals to the buffers 167, where these signals are stored and subsequently transmitted to the units U1,
U4 by means of a further dedicated line 169. Consequently the transmission of the enabling signals from the arbitration
unit 160 to the units U1 and U4 requires two clock periods, but reduces the dispersion of the propagation times. With
this embodiment it is possible to keep the clock period shorter and therefore to keep the operating frequency of the
bus 105 higher. Similar considerations apply in the case in which two or more intermediate buffers are provided for the
connection of the arbitration unit to the more remote units and buffers.

[0028] The clock signal for timing the computer system 100 is generated by a timer unit (CLK) 170. The clock signal
is distributed directly to the arbitration unit 160, to the intermediate buffers 167, to the units U2-U3 and to the buffers
Br1-Br3, Bl1-Bl3 disposed near the timer unit 170, in such a way as to ensure its synchronous reception by the various
timed units and buffers.

[0029] Preferably, the said clock signal is also sent to a phase-locked loop (or PLL) 175, or to other equivalent
devices, which regenerates the clock signal locally in such a way as to synchronize it with the signal produced by the
timer unit 170. The clock signal synchronized by the phase-locked loop 175 is then distributed to the units U1 and U4.
The clock signal is thus in phase at the various points in space of the computer system 100. Similar considerations
are applicable in the case in which further PLLs in cascade connection are provided to distribute the clock signal to
units and buffers more remote from the timer unit 170.

[0030] To describe the operation of the computer system described above, reference will be made to the time diagram
of an example of data transfer in Fig. 2 (the elements previously shown in Fig. 1 are identified by the same reference
numbers or symbols). We shall consider an initial situation at the instant t1 in which one unit U(i) for each bus segment
B(i) requests access to the bus 105 to transfer a data item Di. Each unit U(i) simultaneously accesses the corresponding
bus segment B(i) to which the data item Di is transferred. At the same time, the access paths shown in bold lines are
enabled, so that, at the leading edge of the clock signal at a subsequent instant t2, the data item D1 is loaded into the
buffer Br1, the data item D2 is loaded into buffers Bl1 and Br2, the data item D3 is loaded into buffers Bl2 and Br3, and
the data item D4 is loaded into the buffer Bl3. During the clock period t2 there is no access to the bus segments B1-B4,
since, to avoid transitory problems of contention during the changeover from access by one agent (connected units Ui
or transfer buffers Bri, Bli) to access by another different agent, with possible interference between the signals sent by
different agents, it is necessary to separate such write accesses by at least one intermediate recovery cycle. If we now
consider an instant t3, the accesses shown in bold lines are enabled, so that the data item D2 is transferred to the bus
segment B1, the data item D1 is transferred to the bus segment B2, the data item D4 is transferred to the bus segment
B3 and the data item D3 is transferred to the bus segment B4; at the same time, the command is given for the reading
and loading (shown in bold lines) of the buffers Br2 and Bl2, in which the data items D1 and D4 respectively are therefore
loaded (at the leading edge of the clock signal at a subsequent instant t4). During the clock period t4, there is no access
to the bus segments B1-B4; this enables the bus to be recovered. Similarly, at the instant t5 the data item D3 is transferred
to the bus segment B2 and the data item D2 is transferred to the bus segment B3; the loading of the buffers
Bl1, Br3, in which the data items D3 and D2 respectively are loaded (at the leading edge at an instant t6), is also
enabled. During the same clock period t6, the data item D3 is transferred to the bus segment B1, the data item D4 is
transferred to the bus segment B2, the data item D1 is transferred to the bus segment B3 and the data item D2 is
transferred to the bus segment B4; it should be noted that, in this case, no bus recovery cycle is necessary, since
access to the bus segments B1-B4 is granted to the same agents. At the same time, the reading by the buffers Bl1
and Br3, in which the data items D4 and D1 respectively are loaded (at the leading edge at an instant t7), is also
enabled. During the same clock period t7 (without any recovery cycle) the data items D1 and D4 are therefore transferred
to the bus segments B4 and B1 correspondingly and the data transfer terminates at the instant t8 at which the
bus 105 becomes free again.

[0031] It may be noted from the example described above that each individual data transfer operation on the bus
105 requires a greater number of periods of the clock signal for the transfer of the data item between the various bus
segments.

[0032] However, the clock signal frequency may be increased by comparison with that of a conventional non-segmented
bus, so that the latency period, although variable, has a mean value equal to that of a conventional bus of
equal length.

[0033] The structure according to the present invention also enables a multiplicity of accesses to the bus 105 to be
controlled simultaneously and enables faster bus segments to be used, which also reduces the waiting cycles of any
local buses. For example, let us consider a remote bus with a length of 1.5 m with a frequency of 25 MHz to which
local buses with an operating frequency of 100 MHz are connected; in this situation, at least 3 waiting cycles on the
local bus are required for each access to the remote bus. In the structure described above, however, four remote bus
segments with a length of 40 cm each and a frequency of 100 MHz are used, so that the speed of each segment of
the remote bus is four times greater and no waiting cycle is necessary (or at least a lower number of waiting cycles is
necessary) in the local bus.
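
The arithmetic behind this comparison can be reproduced with a short Python sketch. The function name `waiting_cycles` is hypothetical, and the sketch makes the simplifying assumption that a remote access occupies exactly one remote bus cycle, the first local cycle of which counts as the access itself rather than as a wait.

```python
import math


def waiting_cycles(local_mhz, remote_mhz):
    """Local-bus cycles spent waiting while one remote-bus cycle completes."""
    local_period_ns = 1000.0 / local_mhz
    remote_period_ns = 1000.0 / remote_mhz
    # Number of local cycles covered by one remote cycle, minus the access cycle.
    return max(0, math.ceil(remote_period_ns / local_period_ns) - 1)
```

For a 100 MHz local bus, the sketch yields three waiting cycles against a 25 MHz remote bus (a 40 ns remote cycle spans four 10 ns local cycles) and none against a 100 MHz remote segment.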

[0034] For greater clarity, it may be noted that if it is always the same agent that is obtaining access to one bus
segment, without conflict with other agents, the data are propagated in "pipeline" mode along the bus with an access
frequency which is reduced only if different agents request and obtain access to the bus segments, thereby entering
into competition with one another.

[0035] Additionally, if it is always the same agent that obtains access to one bus segment, there is no risk of contention
between the signals present in the segment in successive periods and there is no need to separate the consecutive
accesses by a recovery period.

[0036] The computer system according to the present invention therefore has a higher overall operating speed; 
moreover, this result is obtained with a particularly compact and simple structure. 

[0037] We may now consider the criteria according to which the arbitration unit ARB 160 grants access to the bus
segments by a plurality of competing agents, in such a way that the data flow through the various bus segments is
optimized.

[0038] For the sake of clarity, it will be useful to proceed in steps.

GENERAL CRITERIA

[0039] Although the external agents which submit requests for access to the system bus are only the processors
connected to the different segments of the bus (and, if there is more than one processor for each segment, these are
treated as a single processor as a result of the action of local arbitration units, one for each segment), the buffers Br(i)
and Bl(i) interconnecting the different segments can also be considered as agents which submit requests for access.
[0040] For this purpose the arbitration unit 160 comprises within it a model of the buffers and of the segments,
consisting of flag storage registers, whose state describes the state of the buffers and segments.
[0041] For example, a buffer is represented by a first flag, which, when set, indicates that the buffer is empty, and
by a second flag, which, when set, indicates that the buffer is full; another flag indicates that a bus segment B(i) is
occupied, and so on.

[0042] These data are used by the arbitration unit to control the data flow. 

[0043] In particular, the flags indicating non-empty buffers may be considered, to all intents and purposes, as signals 
requesting access to a bus segment, generated inside the arbitration unit and arbitrated by the unit. 
[0044] Since the arbitration of access to the segments of the bus is a collective process which takes into account
the state of the various segments, the arbitration unit may be considered as consisting of N arbitration sub-units, each
dedicated to the arbitration of access to one bus segment.

[0045] Consequently, the requests for access to a bus segment B(i) may originate from:

Br(i-1): right buffer
Bl(i): left buffer
P(i): processor or node connected to the segment B(i).

[0046] For the sake of simplicity and clarity, each access request is given the name of the agent which produces it. 
[0047] P(i) is assigned a lower priority than the requests of the two buffers, to prevent obstruction of the flow of data
already present in the bus segments or in the buffers.

[0048] However, to ensure that access to the segment by the buffers does not always take priority over the requests
of the processor, the arbitration unit 160 implements an "unfairness" or forcing algorithm which always guarantees the
processor access to the segment after a predetermined number, for example 6, of accesses by the buffers have been granted.
[0049] A second criterion which is used consists in giving priority to an agent which has obtained access to the
segment, this agent being granted priority access for as long as it requests it, even in subsequent arbitration cycles.
[0050] With this arrangement, it is unnecessary to separate the different accesses to the segment with recovery
periods, and the use of the bus is optimized. It may also be noted that this criterion is the automatic consequence of
the limitations imposed to prevent possible contention between the agents.

[0051] Another consequence of this approach, however, is that the agent which has taken possession of the bus
tends to monopolize it to the detriment of the other agents.

[0052] To prevent this, the arbitration unit 160 implements a fairness algorithm, as a result of which, if an agent, 
despite access requests from another agent, has obtained access to the bus segment for a predetermined number of 
successive periods, for example 3, a mask is generated which prevents the recognition of further access requests from 
the same agent. 

[0053] When the above criteria are not applicable, the buffers have the same priority, and, in order to settle access
conflicts, the arbitration unit implements a round robin mechanism, to guarantee that, statistically, the two buffers have
the same possibilities of access.

[0054] The round robin mechanism is of the global type; in other words, it operates not between the two buffer agents
of a segment, but simultaneously among all the right buffers with respect to the left buffers.
[0055] This solution meets the criterion of guaranteeing that no one buffer takes priority over the others over a period
of time, and also promotes the simultaneous passage of a transaction from one bus segment to the next, under certain
conditions, even if the buffer of arrival is temporarily occupied.

[0056] It is evident that, for a buffer agent to be able to place data in a bus segment, it is a prerequisite that the
destination buffer is available to receive and store the data placed in the bus segment, and if it is not free there must
be a certainty that it will become free in the same time period in which the data are placed in the bus segment.

[0057] For example, if the right buffers (or even only one sequence of them) are fully occupied, the buffer Br(i-1) can
send a data item to the bus B(i) if the buffer Br(i) which is fully occupied can in turn send a data item to the bus B(i+1),
and the buffer Br(i+1) which is fully occupied can in turn send a transaction to the bus B(i+2), and so on.
[0058] It is therefore possible to form a train of transactions which move simultaneously from one bus segment to
the next, in the same direction, provided that all the right buffers have priority over the left buffers, or vice versa, at the
same time.
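
The chained condition for forming such a train can be sketched recursively in Python. This is an illustrative model only: the function `can_send`, the zero-based list indexing and the two input lists are hypothetical, and the sketch assumes that an item placed on a segment by right buffer k must be stored in right buffer k+1, except for the last buffer, whose item is taken to be consumed on the end segment.

```python
def can_send(i, full, right_has_priority):
    """True if right buffer i may place an item on its segment in this period.

    full[k] is True when right buffer k is fully occupied;
    right_has_priority[k] is True when right buffer k has priority over
    the left buffer competing for the same segment (both are assumptions
    of this sketch, not signals named in the description).
    """
    if not right_has_priority[i]:
        return False
    nxt = i + 1
    # Last buffer in the chain, or a destination with a free register.
    if nxt >= len(full) or not full[nxt]:
        return True
    # Destination is full: it must itself be able to send in the same period.
    return can_send(nxt, full, right_has_priority)
```

With three fully occupied right buffers, the first may send only if every buffer downstream also has priority; a single downstream buffer without priority blocks the whole train.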

[0059] This consideration emphasizes the fact that the different sub-arbitration units into which the arbitration unit
160 may be divided cannot ignore, in their arbitration process, the state of the other sub-arbitration units and the
arbitration process which these are carrying on concurrently.

PREFERRED EMBODIMENT OF ARBITRATION UNIT 

[0060] The structure and operation of the arbitration unit 160 are clear from these premises; the unit is shown in the
block diagram in Figure 3 in a preferred embodiment of the many possible ones, which, solely by way of example,
refers to a bus consisting of four segments.

[0061] The arbitration unit 160 comprises an input register I.REGISTER 1, timed by the clock signal CLK, an unfairness
logic unit 3, a round robin logic unit (r.l.R.R.) 4 and four arbitration sub-units 5, 6, 7, 8, one for each bus segment,
and each comprising an output register, of which only the register O.REGISTER 2 of the unit 6 is shown.

[0062] Since the units 5, 6, 7, 8 are structurally and functionally equivalent, only the unit 6 is shown in greater detail.
[0063] The arbitration architecture is of a conventional type:

[0064] At each pulse of the clock signal CLK, the input register 1 stores the external access requests from the processors
P(1), P(i), P(i+1), P(4) applied to its inputs, and submits them at its output to the unfairness logic unit 3 and to
the arbitration sub-units 5, 6, 7, 8.

[0065] In the interval between one clock pulse CLK and the next, for example in a time interval of 10 ns, the different
units carry out the arbitration, on the basis of signals exchanged between them, signals received from the unfairness
unit 3, signals received from the round robin logic 4, and internally generated signals, and, at the clock pulse which
terminates the time interval, load the result of the arbitration into the output register O.REGISTER 2, setting one and
only one at a time of the signals Br(i-1)OE, P(i)OE, Bl(i)OE, which enable the right buffer Br(i-1), the processor P(i),
and the left buffer Bl(i) respectively to access the bus segment B(i).

[0066] The unfairness logic unit 3 comprises four subsections, each dedicated to one bus segment, and a common
round robin unfairness logic 9.

[0067] Of the four subsections, which are identical to each other, only subsection 10, dedicated to the bus segment
B(i), is shown in detail.

[0068] The subsection 10 comprises an AND gate 11, an OR gate 12 and a counter 13.

[0069] The AND gate 11 receives at a first input the signal P(i) from the output of the register 1, and at a second
input (through the OR gate 12) the logical OR of a pair of signals EBr(i-1)OE and EBl(i)OE.

[0070] These two signals, generated by the arbitration sub-unit 6, indicate, when set, the outcome of the current
arbitration, and correspond to the signals Br(i-1)OE and Bl(i)OE latched at the end of the arbitration interval in the
register O.REGISTER 2.

[0071] The output of the AND gate 11 is connected to an enabling input of the counter 13 which, when enabled, 
increments with each clock pulse. 

[0072] For a counting value equal to a predetermined value, for example 6, the counter 13 sends an unfairness signal
to the logic 9 and is inhibited.

[0073] The counter 13 is reset by the signal P(i)OE, at the output from the register 2, by which access to the segment
B(i) by the processor P(i) is granted.
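
The behaviour of one unfairness subsection can be modelled with a short Python sketch. The class `UnfairnessCounter` and its argument names are hypothetical; the sketch assumes, as a simplification not stated in the description, that the reset by P(i)OE takes precedence over counting when both apply on the same clock pulse.

```python
class UnfairnessCounter:
    """Hypothetical model of the AND gate, OR gate and counter of one subsection."""

    def __init__(self, threshold=6):
        self.threshold = threshold
        self.count = 0

    @property
    def unfairness(self):
        # The unfairness signal is set once the predetermined value is reached.
        return self.count >= self.threshold

    def clock(self, p_req, ebr_oe, ebl_oe, p_oe):
        """Advance one clock pulse and return the unfairness signal."""
        if p_oe:
            # Access granted to the processor: the counter is reset.
            self.count = 0
        elif p_req and (ebr_oe or ebl_oe) and not self.unfairness:
            # Processor is requesting while a buffer wins the arbitration;
            # counting stops (counter inhibited) once the signal is set.
            self.count += 1
        return self.unfairness
```

After six pulses in which the processor requests access while a buffer is granted the segment, the signal is set and stays set until the processor is granted access.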

[0074] The unfairness round robin logic 9 arbitrates in a conventional cyclical way between the unfairness signals
received from the different subsections to ensure that only one of the unfairness signals UN(1)M, UN(i)M, UN(i+1)M,
UN(4)M is set at any time towards the arbitration sub-units.

[0075] It should be noted, without going into details, that the joint setting of a plurality of unfairness signals may 
cause a deadlock condition. 

[0076] The logic 9 may conveniently also generate a global unfairness signal UN.G which is the logical OR of the 
signals UN(1)M, UN(i)M, UN(i+1)M, UN(4)M. 
[0077] This signal is distributed and received by all the arbitration sub-units 5, 6, 7, 8.

[0078] A preferred embodiment of the round robin logic, among many possible embodiments, is described in European
patent Publication EP-A-0 782 081.

[0079] The global round robin logic of the buffers (r.l.R.R. 4) is even simpler, and may consist of a simple flip-flop
switched by the clock signal CLK, to set one of two signals Pr and Pl which cyclically grant priority to the right buffers
Br(i) and to the left buffers Bl(i).

[0080] The two signals Pr and Pl are sent to all the arbitration sub-units 5, 6, 7, 8.
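
The flip-flop behaviour of the global round robin logic can be sketched in Python as follows. The class name `GlobalRoundRobin`, the dictionary keys and the initial state (priority to the left buffers) are assumptions made for illustration only.

```python
class GlobalRoundRobin:
    """Hypothetical model of a flip-flop toggled at every clock pulse."""

    def __init__(self):
        # Assumed initial state: priority to the left buffers.
        self._right = False

    def signals(self):
        # Exactly one of the two priority signals is set at any time.
        return {"Pr": self._right, "Pl": not self._right}

    def clock(self):
        """Toggle the flip-flop and return the new priority signals."""
        self._right = not self._right
        return self.signals()
```

Because a single flip-flop drives both signals, every arbitration sub-unit sees the same direction favoured in a given period, which is what allows a train of transactions to move in one direction at once.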
[0081] The sub-arbitration unit 6 will now be considered in more detail. 

[0082] It comprises a right buffer fairness logic consisting essentially of a counter 14, an AND gate 15 and an OR
gate 16, a left buffer fairness logic consisting essentially of a counter 17, an AND gate 18 and an OR gate 19, a
processor fairness logic consisting essentially of a counter 20, an AND gate 21 and two OR gates 22, 23, a state
machine 24 which as a whole is termed MODEL, to store or describe the state of the segment B(i) and of the associated
right and left buffers, an arbitration combination logic 25 and the output register O.REGISTER 2.
[0083] The MODEL logic 24, timed by the clock signal CLK, describes the state of the buffers Br(i-1), Bl(i) and of the
bus segment B(i) in the course of each period of the clock signal as a result of the preceding state and of the operations
set in each period with the setting in the output register 2 of one of the signals Br(i-1)OE, Bl(i)OE and P(i)OE which
are applied to the input of the logic 24.

[0084] Essentially, the logic 24 produces the following signals at the output:

B(i)E, indicating, when set, that the bus segment B(i) is free.
Br(i-1)E, indicating, when set, that the buffer Br(i-1) is empty.
Br(i-1)F, indicating, when set, that the buffer Br(i-1) is full.
Bl(i)E, indicating, when set, that the buffer Bl(i) is empty.
Bl(i)F, indicating, when set, that the buffer Bl(i) is full.

[0085] These signals are applied to the input of the combination logic 25 and some of them are also applied to the
fairness logics and arbitration sub-units associated with the two bus segments adjacent to B(i).

[0086] The behaviour of the fairness logics is completely similar to that of the unfairness logic:
[0087] The counter 14, timed by the clock signal, is incremented at each clock pulse if enabled by an input signal which
is the AND (determined by the gate 15) of the two signals Br(i-1)OE and /Bl(i)E.

[0088] Both here and subsequently the slash / has the logical significance of inversion or negation. 

[0089] Consequently, it increments with each period of the clock signal in which access is granted to the right buffer,
if at the same time the left buffer is not empty.
[0090] After a predetermined number of increments, for example three, the counter 14 is inhibited and a mask signal
Br(i-1)FM is set at the output.

[0091] The counter 14 is reset by the setting of one or other of the two signals Bl(i)OE, P(i)OE applied to a reset
input through the OR gate 16.

[0092] However, it could be reset simply by the setting of the signal Bl(i)OE, thus making the OR gate 16 superfluous.
[0093] Similarly, the counter 17 increments at each pulse of the clock signal, if enabled by an input signal which is
the AND (determined by the gate 18) of the two signals Bl(i)OE and /Br(i-1)E.

[0094] Consequently, it increments with each period of the clock signal in which access is granted to the left buffer 
and at the same time the right buffer is not empty. 

[0095] After a predetermined number of increments, for example three, the counter 17 is inhibited and a mask signal
Bl(i)FM is set at the output.

[0096] The counter 17 is reset by the setting of one or other of the two signals Br(i-1)OE, P(i)OE applied to a reset
input through the OR gate 19, or simply by the setting of the signal Br(i-1)OE.

[0097] In the same way, the counter 20 increments at each pulse of the clock signal if enabled by an input signal
which is the AND (determined by the gate 21) of the signals P(i)OE and one or other of the two signals /Br(i-1)E and
/Bl(i)E supplied in OR mode by the gate 22.

[0098] Consequently, it increments with each period of the clock signal in which access is granted to the processor 
P(i) and at the same time at least one of the two buffers (right and left) is not empty. 

[0099] After a predetermined number of increments, for example three, the counter 20 is inhibited and a mask signal 

P(i)FM is set at the output. 

[0100] The counter 20 is reset by the setting of one or other of the two signals Bl(i)OE or Br(i-1)OE applied to a reset
input through the OR gate 23.
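As a rough illustration of the processor fairness logic (counter 20 with the gates 21, 22 and 23), the following sketch replays a sequence of clock periods and reports the mask P(i)FM after each one. The function name, the dictionary keys and the precedence of reset over enable are assumptions made for the example only.

```python
def processor_fairness_trace(trace, limit=3):
    """Return the P(i)FM mask after each clock period of `trace`.

    Each element of `trace` is a dict of booleans with the keys
    p_oe, br_e, bl_e, br_oe, bl_oe (hypothetical names for
    P(i)OE, Br(i-1)E, Bl(i)E, Br(i-1)OE, Bl(i)OE).
    """
    count = 0
    masks = []
    for s in trace:
        # Gate 22 ORs the two "not empty" signals; gate 21 ANDs with P(i)OE.
        enable = s["p_oe"] and (not s["br_e"] or not s["bl_e"])
        # Gate 23: a grant to either buffer resets the counter.
        reset = s["bl_oe"] or s["br_oe"]
        if reset:
            count = 0
        elif enable and count < limit:
            count += 1  # the counter saturates at `limit`
        masks.append(count >= limit)
    return masks
```

For instance, three periods of processor access while the right buffer holds data set the mask, and the following grant to that buffer clears it.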

[0101] The combination logic 25 receives at its input the signal P(i), the output signals from the fairness (and
unfairness) logics, the output signals from the state machine 24 (MODEL) and the output signals from the round robin logic 4.

[0102] It also receives from the adjacent arbitration sub-units 5 and 7 the state signals Bl(i-1)F, Br(i)F relating to the
left and right buffer respectively of the adjacent bus segments on the left and right.

[0103] It also receives the signals generated, in the same arbitration cycle before this cycle is completed (one way 
of taking them into account in the same arbitration cycle), by the adjacent arbitration sub-units relating to the left and 
right buffers respectively, in other words the signals EBI(i-1)OE and EBr(i)OE. 

[0104] The letter E prefixed to a signal name indicates that the signal is taken at the output of the combination logic,
not from the output register into which it is subsequently loaded; in other words, it is anticipated.

[0105] According to these signals, the combination logic 25 sets at the output one or other of the signals EBr(i-1)OE,
EP(i)OE, EBl(i)OE which, when applied to the input of the output register 2, are loaded into the register at the time
of the clock signal CLK which terminates the arbitration cycle and opens a new cycle.
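The relationship between the anticipated (E-prefixed) signals and the registered grant signals can be pictured with a small model. It is a sketch under stated assumptions: the class name and the dictionary keys are hypothetical, and the check that at most one signal is set reflects the statement that the combination logic sets "one or other" of the three signals.

```python
class OutputRegister:
    """Models output register 2: holds the grants valid during the current cycle."""

    def __init__(self):
        # Registered signals Br(i-1)OE, P(i)OE, Bl(i)OE (hypothetical key names).
        self.q = {"Br_OE": False, "P_OE": False, "Bl_OE": False}

    def clock(self, anticipated):
        """Load the anticipated signals (keys prefixed with 'E') on the CLK edge."""
        if sum(bool(v) for v in anticipated.values()) > 1:
            raise ValueError("at most one agent may be granted the segment")
        self.q = {name: bool(anticipated.get("E" + name, False)) for name in self.q}
        return dict(self.q)
```

Until `clock` is called, the register keeps presenting the grants of the cycle in progress, while the anticipated values are already available to the neighbouring sub-units.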
[0106] The signals are set according to the following logical operations:

Br(i-1)OE = /Br(i-1)E * [/Bl(i)OE * /P(i)OE] * [Bl(i)E + Pr] * [/Br(i-1)FM + UN.G] * /UN(i)M * [/Br(i)F + EBr(i)OE]     (1)

[0107] In other words, for Br(i-1)OE to be set, it is necessary:

1) that the right buffer is not empty (/Br(i-1)E);

2) that there has been no access to the bus segment by other agents in the preceding cycle (/Bl(i)OE * /P(i)OE),
and therefore no recovery cycle is necessary;

3) that, as a result of the round robin cyclic priority, the right buffer takes priority over the left, or the left buffer is
empty and therefore has no reason to obtain access to the bus segment (Bl(i)E + Pr);

4) that the fairness mask of the right buffer is not set, or that the global unfairness mask is set, a condition used
to avoid possible deadlocks (/Br(i-1)FM + UN.G);

5) that the unfairness mask of the processor connected to the same bus segment is not set (/UN(i)M), a condition
also used to avoid possible deadlocks;

6) that the destination buffer is not full, or that in the course of the same arbitration cycle it has been enabled to
access the bus B(i+1) and therefore has an entry which is becoming free (/Br(i)F + EBr(i)OE).
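The six conditions listed above translate directly into a boolean function. The following is a minimal sketch; the function and parameter names are hypothetical and simply mirror the signal names used in the description.

```python
def grant_right_buffer(br_e, bl_oe, p_oe, bl_e, pr, br_fm, un_g, un_m,
                       br_next_f, e_br_next_oe):
    """Evaluate the grant Br(i-1)OE from the six listed conditions.

    br_e         Br(i-1)E   right buffer empty
    bl_oe, p_oe  Bl(i)OE, P(i)OE  grants of the preceding cycle
    bl_e         Bl(i)E     left buffer empty
    pr           Pr         round robin currently favours the right buffer
    br_fm        Br(i-1)FM  fairness mask of the right buffer
    un_g, un_m   UN.G, UN(i)M  global and local unfairness masks
    br_next_f    Br(i)F     destination buffer full
    e_br_next_oe EBr(i)OE   destination buffer granted in this same cycle
    """
    return bool(
        not br_e                              # 1) something to send
        and (not bl_oe and not p_oe)          # 2) no recovery cycle needed
        and (bl_e or pr)                      # 3) priority over the left buffer
        and (not br_fm or un_g)               # 4) fairness mask, with deadlock escape
        and not un_m                          # 5) processor not being favoured
        and (not br_next_f or e_br_next_oe)   # 6) room in the destination buffer
    )
```

Each line of the conjunction corresponds to one numbered condition, so a single false term is enough to withhold the grant.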

Bl(i)OE = /Bl(i)E * [/Br(i-1)OE * /P(i)OE] * [Br(i-1)E + Pl] * [/Bl(i)FM + UN.G] * /UN(i)M * [/Bl(i-1)F + EBl(i-1)OE]     (2)

[0108] In other words, criteria equivalent to those considered above are applicable. 

P(i)OE = P(i) * [/Br(i-1)OE * /Bl(i)OE] * [/EBr(i-1)OE * /EBl(i)OE] * /P(i)FM * [/Br(i)F + EBr(i)OE] * [/Bl(i-1)F + EBl(i-1)OE]     (3)

[0109] In other words, for P(i)OE to be set, it is necessary:

1) that there is a request for access by the processor (P(i));

2) that no access to the bus segment has been given in the preceding cycle to other agents (/Br(i-1)OE * /Bl(i)OE),
and therefore no recovery is necessary;

3) that no access has to be given to the buffers Br(i-1) and Bl(i) in the same arbitration cycle because of their
higher priority (/EBr(i-1)OE * /EBl(i)OE);

4) that the fairness mask of the processor is not set (/P(i)FM);

5) that both destination buffers Br(i) and Bl(i-1) have a free entry or are about to have one.
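The conditions for the processor grant can be sketched in the same way. As before, the function and parameter names are hypothetical and only mirror the signals named in the description.

```python
def grant_processor(p_req, br_oe, bl_oe, e_br_oe, e_bl_oe, p_fm,
                    br_next_f, e_br_next_oe, bl_prev_f, e_bl_prev_oe):
    """Evaluate the grant P(i)OE from the five listed conditions.

    p_req            P(i)        processor request
    br_oe, bl_oe     Br(i-1)OE, Bl(i)OE    buffer grants of the preceding cycle
    e_br_oe, e_bl_oe EBr(i-1)OE, EBl(i)OE  buffer grants anticipated for this cycle
    p_fm             P(i)FM      fairness mask of the processor
    br_next_f        Br(i)F      right destination buffer full
    e_br_next_oe     EBr(i)OE    right destination buffer granted in this cycle
    bl_prev_f        Bl(i-1)F    left destination buffer full
    e_bl_prev_oe     EBl(i-1)OE  left destination buffer granted in this cycle
    """
    return bool(
        p_req                                  # 1) the processor asks for the bus
        and not br_oe and not bl_oe            # 2) no recovery cycle needed
        and not e_br_oe and not e_bl_oe        # 3) buffers keep their higher priority
        and not p_fm                           # 4) fairness mask not set
        and (not br_next_f or e_br_next_oe)    # 5a) room in Br(i)
        and (not bl_prev_f or e_bl_prev_oe)    # 5b) room in Bl(i-1)
    )
```

Unlike a buffer, the processor needs room in both destination buffers, because a data item placed on the segment is propagated in both directions.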

[0110] It is evident that the arbitration sub-units 5 and 8 which control access to the end segments of the bus are
simplified with respect to those indicated previously, since the segments are provided with an input buffer and an output
buffer at one end only.

[0111] Conversely, it must be remembered that the access enabling signals P(1)OE and P(4)OE are received with
a delay equal to one clock signal period because of the effect of the buffer 167 (Fig. 1). This fact has to be taken
into account in equations (1) and (2) by using, in place of the signal P(i)OE at the output of the output register of the
arbitration sub-unit, a corresponding signal obtained from P(i)OE with a further buffering level in cascade connection.
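The further buffering level mentioned above behaves like a one-stage register that delays P(i)OE by one clock period, so that the arbitration equations see the grant at the same time as the end unit does. A minimal sketch, with a hypothetical class name:

```python
class DelayRegister:
    """One-stage register: the output at period n is the input at period n-1."""

    def __init__(self, initial=False):
        self.q = initial

    def clock(self, d):
        """Present the stored value, then store `d` for the next period."""
        out = self.q
        self.q = d
        return out
```

Feeding the sequence of P(1)OE values through such a register yields the delayed copy to be used in equations (1) and (2) for the end segments.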
[0112] The preceding description relates only to a preferred embodiment of the invention, but it is clear that many 
modifications may be made. 

[0113] As stated previously, the FIFO buffers separating the different bus segments may have a depth ranging from
one to several entries.

[0114] Additionally, the arbitration criteria may be different from those described, although they would then be less 
efficient. 

[0115] For example, it is possible to simplify the arbitration logic by providing as the basic principle the granting of
access to each bus segment to the same agent which had obtained it previously, and the forcing of its release with the
fairness mechanism.

[0116] To distribute access to the different agents it is possible to use a round robin algorithm of the global type 
which, instead of alternating access priority between the right and left buffers only, alternates it cyclically between the 
right buffers, the left buffers and the processors connected to the different segments, thereby avoiding the necessity 
of using an unfairness mechanism. 
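One way to picture such a global round robin is a rotating priority list over the three classes of agent. The sketch below is an assumption-laden illustration: the class name is hypothetical, and the policy of moving the winner to the back of the list is one common round robin variant, not necessarily the one intended here.

```python
from collections import deque


class GlobalRoundRobin:
    """Rotating priority among right buffers, left buffers and processors."""

    def __init__(self, classes=("right", "left", "processor")):
        self.order = deque(classes)  # front of the deque has highest priority

    def pick(self, requesting):
        """Grant the highest-priority requesting class and move it to the back."""
        winner = next((a for a in self.order if a in requesting), None)
        if winner is not None:
            self.order.remove(winner)
            self.order.append(winner)
        return winner
```

Because every class eventually reaches the front of the list, no class can be starved by the others, which is the role otherwise played by the unfairness mechanism.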

[0117] It is also possible to divide the arbitration unit 160 into a plurality of arbitration sub-units, each dedicated to 
one bus segment and decentralized spatially, intercommunicating through buffers for synchronizing and re-timing the 
different signals at the input and at the output of the arbitration sub-units. 

[0118] In any case, although the latency time or the time required to execute a transaction, in other words to transfer 
data from one processor to another, through a number of segments of the system bus may (in case of access conflict 
between the processors and buffers) be greater than the time required in the case of a conventional bus, the possibility 
of executing a number of transactions simultaneously enables a higher global performance or throughput to be 
achieved. 

[0119] This is so for two reasons: the bus can operate at a higher frequency, and a number of transactions can be 
executed in parallel, in other words with superimposition in time. 



Claims 

1. A computer system (100) comprising a multi-point communication bus (105), a plurality of units (U1-U4) connected
to the bus (105), and a timer unit (170) to generate a periodic system clock signal (CLK), the bus (105) comprising
a plurality of bus segments (B1, ... B(i), ... B4), each bus segment (B(i)) being concatenated with at least one adjacent
bus segment (B(i-1), B(i+1)) through concatenation buffer registers (Br(i-1), Bl(i)) to transfer a data item from the
adjacent bus segment (B(i-1), B(i+1)) to the said bus segment (Bi),

characterized in that it further comprises an arbitration unit (160) to control the simultaneous access, in a
single period of the said clock signal, to the said plurality of bus segments and, for each bus segment (B(i)), by a
unit (Ui) connected to the said bus segment (Bi) and, in a mutually exclusive way with the said unit, by one of the
buffer registers (Br(i-1), Bl(i)) for concatenation of the said bus segment (B(i)) with one adjacent segment (B(i-1),
B(i+1)).

2. Computer system (100) according to Claim 1, in which the said concatenation buffer registers (Br1-Br3, Bl1-Bl3)
include a bank of FIFO registers.

3. Computer system (100) according to Claim 1 or 2, in which the arbitration unit (160) is connected to each unit
(U1-U4) and to each of the concatenation buffer registers (Br1-Br3, Bl1-Bl3) by means of a dedicated line (161;
168, 169) to send an access enabling signal, at least one of the dedicated lines comprising a plurality of portions
of line (168, 169) concatenated by means of further buffer registers (167).

4. Computer system (100) according to any of Claims 1 to 3, in which the said timer unit (170) is connected to a group
of the said units (U2, U3) and of the said buffer registers (Br1-Br3, Bl1-Bl3) to supply the said clock signal (CLK)
to the said group, and at least one synchronization unit (175) is connected to the said timer unit to receive the said
clock signal (CLK) and is connected to a further group of the said units (U1, U4) and of the said buffer registers,
to supply to the said further group a timing signal which is regenerated and synchronized with the said clock signal.

5. Computer system (100) according to Claim 4, in which the said at least one synchronization unit (175) consists of
a phase-locked loop (PLL).

6. Computer system (100) according to any of Claims 1 to 5, in which at least one of the said units (U1-U4) consists
of an interface bridge for connection to a local bus.

7. System according to preceding claims, in which the said arbitration unit (160) comprises, for each of the said bus
segments, a state machine (24) whose state represents the state of the said bus segment and of the buffer registers
for concatenation of the said segment with at least one adjacent segment.

8. System according to Claim 7, in which the said arbitration unit comprises, for each of the said bus segments, a
combination logic (25) which permits consecutive accesses to the bus segment by a single agent, unit or concatenation
buffer register, for a plurality of consecutive periods of the said clock signal (CLK), if the said agent has data
to transfer and subject to the test of feasibility of each of the accesses by the said combination logic (25), and, for
each of the said agents, comprises fairness logic circuits (14, 15, 16, ... 22, 23) which, after a predetermined
number of consecutive accesses, prevent further accesses by the same agent if another agent has data to transfer.

9. System according to Claim 8, in which the said arbitration unit (160) assigns access priorities, for each segment, 
to the said concatenation buffer registers, if they are not empty, with respect to the said units connected to the 
segment, and in which the said arbitration unit comprises, for each segment, an unfairness logic (10) to prevent 
further accesses to the said segment by the said concatenation buffer registers if the said units connected to the 
segment have submitted a request for access to the said arbitration unit, for access to the said segment, for a 
predetermined number of consecutive periods of the said clock signal. 

10. System according to Claim 9, in which the said concatenation buffer registers of each bus segment comprise right
buffers (Br(i)) for transferring data in a first direction of concatenation of the said bus segments and left buffers
(Bl(i)) for transferring data in the opposite direction of concatenation, and in which the said arbitration unit (160)
comprises a round robin logic (4) to alternate cyclically the relative priority between the said right and left buffers
in access to the said bus segments.



Patentansprüche

1. Rechnersystem (100) umfassend einen Mehrpunkt-Kommunikationsbus (105), eine Vielzahl von Einheiten
(U1-U4), die mit dem Bus verbunden sind, und eine Zeitgebereinheit (170), um ein periodisches Systemtaktsignal
(CLK) zu erzeugen, wobei der Bus (105) eine Vielzahl von Bussegmenten (B1, ... B(i), ... B4) umfasst, wobei jedes
Bussegment (B(i)) mit zumindest einem angrenzenden Bussegment (B(i-1), B(i+1)) über Verkettungspufferregister
(Br(i-1), Bl(i)) verkettet ist, um ein Datenelement vom angrenzenden Bussegment (B(i-1), B(i+1)) zu diesem
Bussegment (Bi) zu übertragen,

dadurch gekennzeichnet, dass es ferner eine Entscheidungseinheit (160) umfasst, um in einem einzigen
Zeitraum des Taktsignals den gleichzeitigen Zugriff auf die Vielzahl von Bussegmenten, und für jedes Bussegment
(B(i)) durch eine Einheit (Ui), die mit dem Bussegment (Bi) verbunden ist, und in einer gegenseitig ausschließenden
Weise mit dieser Einheit durch eines der Pufferregister (Br(i-1), Bl(i)) zur Verkettung des Bussegments (B(i)) mit
einem angrenzenden Segment (B(i-1), B(i+1)) zu steuern.

2. Rechnersystem (100) gemäß Anspruch 1, bei dem die Verkettungspufferregister (Br1-Br3, Bl1-Bl3) eine Reihe
von FIFO-Registern umfasst.

3. Rechnersystem (100) nach Anspruch 1 oder 2, bei dem die Entscheidungseinheit (160) mit jeder Einheit (U1-U4)
und jedem der Verkettungspufferregister (Br1-Br3, Bl1-Bl3) mittels einer dedizierten Leitung (161; 168, 169)
verbunden ist, um ein Zugriffsfreigabesignal zu senden, wobei zumindest eine der dedizierten Leitungen eine Vielzahl
von Bereichen der Leitung (168, 169) umfasst, die mittels weiterer Pufferregister (167) verkettet ist.

4. Rechnersystem (100) nach einem der Ansprüche 1 bis 3, bei dem die Zeitgebereinheit (170) mit einer Gruppe der
Einheiten (U2, U3) und der Pufferregister (Br1-Br3, Bl1-Bl3) verbunden ist, um für diese Gruppe das Taktsignal
(CLK) bereitzustellen, und bei dem zumindest eine Synchronisierungseinheit (175) mit der Zeitgebereinheit
verbunden ist, um das Taktsignal (CLK) zu empfangen und mit einer weiteren Gruppe der Einheiten (U1, U4) und der
Pufferregister verbunden ist, um für diese Gruppe ein Synchronisierungssignal, das mit dem Taktsignal regeneriert
und synchronisiert wird, bereitzustellen.

5. Rechnersystem (100) nach Anspruch 4, bei dem die zumindest eine Synchronisierungseinheit (175) aus einem
Phasenregelkreis (PLL) gebildet ist.

6. Rechnersystem (100) nach einem der Ansprüche 1 bis 5, bei dem zumindest eine der Einheiten (U1-U4) aus einer
Schnittstellenbrücke zum Verbinden mit einem lokalen Bus gebildet ist.

7. System nach den vorhergehenden Ansprüchen, bei dem die Entscheidungseinheit (160) für jedes der
Bussegmente eine Zustandsvorrichtung (24) umfasst, deren Zustand den Zustand des Bussegments und der
Pufferregister zur Verkettung des Segments mit zumindest einem angrenzenden Segment darstellt.

8. System nach Anspruch 7, bei dem die Entscheidungseinheit für jedes der Bussegmente eine Kombinationslogik
(25) umfasst, welche aufeinander folgende Zugriffe auf das Bussegment durch eine einzelne Einrichtung, Einheit
oder Verkettungspufferregister, für eine Vielzahl von aufeinander folgenden Zeiträumen des Taktsignals (CLK)
ermöglicht, wenn diese Einrichtung Daten zu übertragen hat und abhängig von der Überprüfung der
Durchführbarkeit jedes der Zugriffe durch die Kombinationslogik (25), und für jede dieser Einrichtungen
Gerechtigkeitslogikschaltkreise (14, 15, 16, 22, 23) umfasst, die nach einer vorbestimmten Anzahl von aufeinander folgenden
Zugriffen weitere Zugriffe durch die gleiche Einrichtung verhindert, wenn eine andere Einrichtung Daten zu
übertragen hat.

9. System nach Anspruch 8, bei dem die Entscheidungseinheit (160) für jedes Segment Zugriffsprioritäten zu den
Verkettungspufferregistern zuweist, wenn diese nicht leer sind, in Bezug auf die Einheiten, die mit dem Segment
verbunden sind, und bei dem die Entscheidungseinheit für jedes Segment eine Ungerechtigkeitslogik (10) umfasst,
um weitere Zugriffe auf das Segment durch die Verkettungspufferregister zu verhindern, wenn die Einheiten, die
mit dem Segment verbunden sind, eine Zugriffsanforderung an die Entscheidungseinheit ausgegeben haben, um
Zugriff auf das Segment für eine vorbestimmte Anzahl von aufeinander folgenden Zeiträumen des Taktsignals zu
erhalten.

10. System nach Anspruch 9, bei dem die Verkettungspufferregister jedes Bussegments rechte Puffer (Br(i)) zum
Übertragen von Daten in einer ersten Verkettungsrichtung des Bussegments und linke Puffer (Bl(i)) zum
Übertragen von Daten in der entgegengesetzten Verkettungsrichtung umfassen, und bei dem die Entscheidungseinheit
(160) eine Reigenlogik (4) umfasst, um zyklisch die relative Priorität zwischen den rechten und den linken Puffern
beim Zugriff auf die Bussegmente abzuwechseln.



Revendications

1. Système d'ordinateur (100) qui comprend un bus de communication multipoint (105), une pluralité d'unités
(U1-U4) qui sont connectées au bus (105) et une unité de minuterie (170) pour générer un signal d'horloge système
périodique (CLK), le bus (105) comprenant une pluralité de segments de bus (B1, B(i), B4), chaque segment
de bus (B(i)) étant concaténé avec au moins un segment de bus adjacent (B(i-1), B(i+1)) par l'intermédiaire de
registres tampon de concaténation (Br(i-1), Bl(i)) afin de transférer un élément de données depuis le segment de
bus adjacent (B(i-1), B(i+1)) jusqu'audit segment de bus (Bi),

caractérisé en ce qu'il comprend en outre une unité d'arbitrage (160) pour commander l'accès simultané,
dans une unique période dudit signal d'horloge, à ladite pluralité de segments de bus et, pour chaque segment
de bus (B(i)), au moyen d'une unité (Ui) qui est connectée audit segment de bus (Bi) et, d'une façon mutuellement
exclusive par rapport à ladite unité, au moyen de l'un des registres tampon (Br(i-1), Bl(i)) pour une concaténation
dudit segment de bus (B(i)) avec un segment adjacent (B(i-1), B(i+1)).

2. Système d'ordinateur (100) selon la revendication 1, dans lequel lesdits registres tampon de concaténation
(Br1-Br3, Bl1-Bl3) incluent un groupement de registres FIFO.

3. Système d'ordinateur (100) selon la revendication 1 ou 2, dans lequel l'unité d'arbitrage (160) est connectée à
chaque unité (U1-U4) et à chacun des registres tampon de concaténation (Br1-Br3, Bl1-Bl3) au moyen d'une
ligne dédiée (161; 168, 169) afin d'envoyer un signal de validation d'accès, au moins l'une des lignes dédiées
comprenant une pluralité de parties de ligne (168, 169) qui sont concaténées au moyen de registres tampon
supplémentaires (167).

4. Système d'ordinateur (100) selon l'une quelconque des revendications 1 à 3, dans lequel ladite unité de minuterie
(170) est connectée à un groupe desdites unités (U2, U3) et desdits registres tampon (Br1-Br3, Bl1-Bl3) pour
appliquer ledit signal d'horloge (CLK) sur ledit groupe, et au moins une unité de synchronisation (175) est connectée
à ladite unité de minuterie pour recevoir ledit signal d'horloge (CLK) et est connectée à un autre groupe desdites
unités (U1, U4) et desdits registres tampon pour appliquer sur ledit autre groupe un signal de cadencement qui
est régénéré et synchronisé avec ledit signal d'horloge.

5. Système d'ordinateur (100) selon la revendication 4, dans lequel ladite au moins une unité de synchronisation
(175) est constituée par une boucle à verrouillage de phase (PLL).

6. Système d'ordinateur (100) selon l'une quelconque des revendications 1 à 5, dans lequel au moins l'une desdites
unités (U1-U4) est constituée par un pont d'interface pour une connexion sur un bus local.

7. Système d'ordinateur (100) selon l'une des revendications précédentes, dans lequel ladite unité d'arbitrage (160)
comprend, pour chacun desdits segments de bus, une machine d'état (24) dont un état représente l'état dudit
segment de bus et des registres tampon pour une concaténation dudit segment avec au moins un segment
adjacent.

8. Système selon la revendication 7, dans lequel ladite unité d'arbitrage (160) comprend, pour chacun desdits
segments de bus, une logique de combinaison (25) qui permet des accès consécutifs au segment de bus au moyen
d'un unique agent, d'une unique unité ou d'un unique registre tampon de concaténation pendant une pluralité de
périodes consécutives dudit signal d'horloge (CLK), si ledit agent dispose de données à transférer et à soumettre
au test de faisabilité de chacun des accès au moyen de ladite logique de combinaison (25) et pour chacun desdits
agents, comprend des circuits logiques d'équité (14, 15, 16, 22, 23) qui, après un nombre prédéterminé d'accès
consécutifs, empêchent des accès supplémentaires par le même agent si un autre agent dispose de données à
transférer.

9. Système selon la revendication 8, dans lequel ladite unité d'arbitrage (160) assigne des priorités d'accès pour
chaque segment auxdits registres tampon de concaténation s'ils ne sont pas vides en relation avec lesdites unités
qui sont connectées au segment et dans lequel ladite unité d'arbitrage comprend, pour chaque segment, une
logique d'équité (10) pour empêcher des accès supplémentaires audit segment par lesdits registres tampon de
concaténation si lesdites unités qui sont connectées au segment ont soumis une requête pour un accès à ladite
unité d'arbitrage, pour accéder audit segment, pendant un nombre prédéterminé de périodes consécutives dudit
signal d'horloge.

10. Système selon la revendication 9, dans lequel lesdits registres tampon de concaténation de chaque segment de
bus comprennent des tampons à droite (Br(i)) pour transférer des données suivant une première direction de
concaténation desdits segments de bus et des tampons à gauche (Bl(i)) pour transférer des données suivant la
direction opposée de concaténation et dans lequel ladite unité d'arbitrage (160) comprend une logique à tour de
rôle (4) pour alterner cycliquement la priorité relative entre lesdits tampons à droite et à gauche lors de l'accès
auxdits segments de bus.